Multimodal Stereo from Thermal Infrared and Visible Spectrum
Abstract
Recent advances in thermal infrared imaging (LWIR) have allowed its use in applications beyond the military domain. Nowadays, this new family of sensors is included in a range of technical and scientific applications. These sensors offer features that facilitate tasks such as detecting pedestrians, hot spots, and temperature differences, which can significantly improve the performance of systems in which people are expected to play the principal role, for instance video surveillance, monitoring, and pedestrian detection. This dissertation poses the following question: could a pair of sensors measuring different bands of the electromagnetic spectrum, such as the visible and the thermal infrared, be used to extract depth information? Although it is a complex question, we show that a system with these characteristics is possible, and we discuss its advantages, drawbacks, and potential opportunities. In this research, an experimental study comparing different cost functions and matching approaches is performed in order to build a multimodal stereovision system. Furthermore, the common problems in infrared/visible stereo, especially in outdoor scenes, are identified. Our framework summarizes the architecture of a generic stereo algorithm at different levels (computational, functional, and structural), which can be extended toward high-level fusion (semantic) and high-order (prior). The proposed framework is intended to explore novel multimodal stereo matching approaches, going from sparse to dense representations (both disparity and depth maps). Moreover, context information is added in the form of priors and assumptions. Finally, the dissertation shows a promising way toward the integration of multiple sensors for recovering three-dimensional information. The dissertation covers the main aspects of a multimodal stereo system: camera setup, matching cost functions, and disparity computation.
The first part presents several experiments carried out with different camera configurations. As a tangible result, two multimodal datasets and their corresponding ground truth data were acquired and published. These datasets consist of: (i) thermal infrared and visible images in raw format as well as their rectified versions; (ii) disparity maps; (iii) 3D point clouds; (iv) hand-annotated planar regions; (v) synthesized disparity maps; and (vi) labeled image regions (valid and occluded image regions). To our knowledge, no similar datasets are available for evaluation and comparison. The second part presents a study of the different matching cost functions proposed during this dissertation. Finally, two dense stereo matching algorithms for
Similar resources
TRANSFER REPORT Thermal Infrared and Visible Spectrum Fusion for Multi-modal Video Analysis
While traditional image and video processing focus on extracting knowledge from data of a single modality, such as visual spectrum or thermal infrared video, this report investigates the benefits and challenges of capturing and analysing multimodal video. It specifically targets the two modalities of visible spectrum and thermal infrared video. A novel capture device has been developed to captu...
Thermal / Visible Stereo Vision for Electric Power Systems Autonomous Monitoring Systems
Thermography is a technique widely used for inspecting the operating conditions of electrical equipment. However, its predominantly manual operation hampers more reliable diagnosis of indications of component abnormality. This paper presents a straightforward hybrid stereo vision configuration for autonomous monitoring systems that uses visible and infrared spectrum images to identify, character...
Local self-similarity-based registration of human ROIs in pairs of stereo thermal-visible videos
For several years, mutual information (MI) has been the classic multimodal similarity measure. The robustness of MI is closely restricted by the choice of MI window sizes. For unsupervised human monitoring applications, obtaining appropriate MI window sizes for computing MI in videos with multiple people in different sizes and different levels of occlusion is problematic. In this work, we apply...
On Cross-Spectral Stereo Matching using Dense Gradient Features
Here we address the problem of scene depth recovery within cross-spectral stereo imagery (each image sensed over a differing spectral range). We compare several robust matching techniques which are able to capture local similarities between the structure of cross-spectral images and a range of stereo optimisation techniques for the computation of valid dense depth estimates for this case. As th...
Multimodal Stereo Vision Using Mutual Information with Adaptive Windowing
This paper proposes a method for computing disparity maps from a multimodal stereovision system composed of an infrared and a visible camera pair. The method uses mutual information (MI) as the basic similarity measure where a segmentation-based adaptive windowing mechanism is proposed for greatly enhancing the results. On several datasets, we show that (i) our proposal improves the quality of ...
Publication date: 2014